Take_home_Ex1

pacman::p_load(sf, tmap, spdep, tidyverse, dplyr, mapview, sfdep, stplanr)
Overview
With the recent shift towards digital payment, companies and organisations can more easily collect data linked to consumer habits. The transportation industry, including public transport such as buses, has also leaned into this phenomenon. The information collected includes travelling patterns that can help companies plan more efficient routes or anticipate where heavy ridership is expected.
Objectives
Exploratory Spatial Data Analysis (ESDA) holds tremendous potential to address complex problems facing society. In this study, you are tasked to apply appropriate Local Indicators of Spatial Association (LISA) and Emerging Hot Spot Analysis (EHSA) to uncover the spatial and spatio-temporal mobility patterns of public bus passengers in Singapore.
Task
Here we will utilise bus travelling data over different time intervals, plot the geospatial data, and analyse it using various statistical tools.
Geovisualisation and Analysis
Computing passenger trips at the hexagonal level for the following time intervals:
- Weekday morning peak, 6am to 9am
- Weekday afternoon peak, 5pm to 8pm
- Weekend/holiday morning peak, 11am to 2pm
- Weekend/holiday evening peak, 4pm to 7pm
Display the geographical distribution using choropleth maps of the hexagons.
Combine the passenger trips made at all of the bus stops within each hexagon.
Local Indicators of Spatial Association(LISA) Analysis
Utilise Queen’s contiguity to perform LISA of the passenger trips by origin at the hexagonal level. Display the LISA maps of the passenger trips at the hexagonal level.
Load Packages and Data
Load packages
Here we will load the packages needed for this exercise:
- sf: handling and processing of simple features (vector geospatial data)
- tmap: creating thematic maps such as choropleths
- spdep: spatial dependence analysis, including neighbour lists and spatial weights
- tidyverse: general data import, wrangling and visualisation
- dplyr: data manipulation verbs such as filter(), group_by() and summarise()
- mapview: quick interactive viewing of spatial data
- sfdep: tidyverse-friendly spatial dependence analysis built on sf and spdep
- stplanr: transport planning tools, including desire lines
Loading data
Loading aspatial table
Here we will read the ridership recorded at the different bus stops in October 2023 and assign it to the odbus variable.
odbus <- read_csv("data/aspatial/origin_destination_bus_202310.csv")

We will then extract the records for the following time intervals and assign them to different variables.
| Day | Duration | Variable name |
|---|---|---|
| Weekdays | 6am - 9am | weekdayAM_6_9 |
| Weekdays | 5pm - 8pm | weekdayPM_5_8 |
| Weekends/Holidays | 11am - 2pm | weekendAM_11_14 |
| Weekends/Holidays | 4pm - 7pm | weekendPM_16_19 |
weekdayAM_6_9 <- odbus %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 6 & TIME_PER_HOUR <= 9) %>%
group_by(ORIGIN_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
weekdayPM_5_8 <- odbus %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 17 & TIME_PER_HOUR <= 20) %>%
group_by(ORIGIN_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
weekendAM_11_14 <- odbus %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 11 & TIME_PER_HOUR <= 14) %>%
group_by(ORIGIN_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
weekendPM_16_19 <- odbus %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 16 & TIME_PER_HOUR <= 19) %>%
group_by(ORIGIN_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))

Loading Geospatial data
Next we will import all of the bus stops and their coordinates and attach them to the busstop variable.
busstop <- st_read(dsn = "data/geospatial", layer = "BusStop") %>%
st_transform(crs = 3414)

Reading layer `BusStop' from data source
`C:\Users\Lian Khye\Desktop\MITB\Geospatial\geartooth\ISSS624\Take_home_Ex1\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
We will first rename the bus stop column title for easier data joining.
colnames(busstop)[colnames(busstop) == "BUS_STOP_N"] <- "ORIGIN_PT_CODE"

We will also import the outline of Singapore so that bus stops that are not found in Singapore can be excluded.
mpsz <- st_read(dsn = "data/geospatial",
layer = "MPSZ-2019") %>%
st_transform(crs = 3414)

Reading layer `MPSZ-2019' from data source
`C:\Users\Lian Khye\Desktop\MITB\Geospatial\geartooth\ISSS624\Take_home_Ex1\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS: WGS 84
After that we will create the hexagons that form the map layout, based on a 250 m cell size. Every hexagon will also be given a grid id that can be used to identify each individual grid.
center <- st_centroid(busstop)
area_honeycomb_grid <- st_make_grid(center, cellsize = c(250 * sqrt(3), 250 * 2), what = "polygons", square = FALSE)
honeycomb_grid_sf <- st_sf(area_honeycomb_grid) %>%
mutate(grid_id = 1:length(lengths(area_honeycomb_grid)))

Data processing
Assigning individual bus stop to hexagons
First we will assign the bus stop point geometry data to each polygon using st_intersection(). The function assigns points to polygons via the point-set intersection of the two geometries. Additional information here.
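As a side note (a sketch, not the method used below), st_join() with join = st_within achieves a similar point-to-polygon assignment while keeping every point geometry intact:

```r
# Sketch: tag each bus stop with the attributes (e.g. grid_id) of the
# hexagon that contains it. Unlike st_intersection(), st_join() keeps
# every point and simply appends the matching polygon's columns.
busstop_tagged <- sf::st_join(busstop, honeycomb_grid_sf, join = sf::st_within)
```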
valid_busstop <- st_intersection(busstop, mpsz)
busstop_hex <- st_intersection(valid_busstop, honeycomb_grid_sf) %>%
st_drop_geometry()

Duplication check
Here we will check for the presence of any duplication before we further process the data.
duplicate1 <- weekdayAM_6_9 %>%
group_by_all() %>%
filter(n()>1) %>%
ungroup()
duplicate2 <- weekdayPM_5_8 %>%
group_by_all() %>%
filter(n()>1) %>%
ungroup()
duplicate3 <- weekendAM_11_14 %>%
group_by_all() %>%
filter(n()>1) %>%
ungroup()
duplicate4 <- weekendPM_16_19 %>%
group_by_all() %>%
filter(n()>1) %>%
ungroup()

We can see which data points are duplicated here.
c(duplicate1,duplicate2,duplicate3,duplicate4)$ORIGIN_PT_CODE
character(0)
$TRIPS
numeric(0)
$ORIGIN_PT_CODE
character(0)
$TRIPS
numeric(0)
$ORIGIN_PT_CODE
character(0)
$TRIPS
numeric(0)
$ORIGIN_PT_CODE
character(0)
$TRIPS
numeric(0)
All four results are empty, meaning there are no duplicates. As a safeguard, we will still keep only unique data points using the unique() function.
unique_weekdayAM <- unique(weekdayAM_6_9)
unique_weekdayPM <- unique(weekdayPM_5_8)
unique_weekendAM <- unique(weekendAM_11_14)
unique_weekendPM <- unique(weekendPM_16_19)

Trip tabulation
Next we will join the interval variables created earlier, which contain the total number of trips, with the busstop_hex variable, using the ORIGIN_PT_CODE column that they have in common. The trips made from each hexagon are then summed up and placed under a new column named TOT_TRIPS.
count_weekdayAM_6_9 <- left_join(unique_weekdayAM , busstop_hex) %>%
group_by(grid_id) %>%
summarise(TOT_TRIPS = sum(TRIPS))
count_weekdayPM_5_8 <- left_join(unique_weekdayPM , busstop_hex) %>%
group_by(grid_id) %>%
summarise(TOT_TRIPS = sum(TRIPS))
count_weekendAM_11_14 <- left_join(unique_weekendAM , busstop_hex) %>%
group_by(grid_id) %>%
summarise(TOT_TRIPS = sum(TRIPS))
count_weekendPM_16_19 <- left_join(unique_weekendPM , busstop_hex) %>%
group_by(grid_id) %>%
summarise(TOT_TRIPS = sum(TRIPS))

Reassign polygon information
We will then reassign the polygon information from the hexagonal map that we created earlier.
poly_weekdayAM_6_9 <- left_join(honeycomb_grid_sf,count_weekdayAM_6_9)
poly_weekdayPM_5_8 <- left_join(honeycomb_grid_sf,count_weekdayPM_5_8)
poly_weekendAM_11_14 <- left_join(honeycomb_grid_sf,count_weekendAM_11_14)
poly_weekendPM_16_19 <- left_join(honeycomb_grid_sf,count_weekendPM_16_19)

Filter for empty trips
Following that we will filter hexagons that have no trips to obtain only valid hexagons for mapping.
grid_weekdayAM <- poly_weekdayAM_6_9 %>%
filter(TOT_TRIPS > 0)
grid_weekdayPM <- poly_weekdayPM_5_8 %>%
filter(TOT_TRIPS > 0)
grid_weekendAM <- poly_weekendAM_11_14 %>%
filter(TOT_TRIPS > 0)
grid_weekendPM <- poly_weekendPM_16_19 %>%
filter(TOT_TRIPS > 0)

Choropleth map
Here we will plot the choropleth map for the different time intervals. We will be using tmap_mode("view") to create an interactive map. Although we will be coding in accessories such as the compass, they will not be displayed in the interactive map. However, by writing them in now, we can display them in subsequent maps once we switch to plot mode.
tmap_mode("view")
mapA <- tm_shape(grid_weekdayAM)+
tm_fill("TOT_TRIPS",
style = "quantile",
palette = "Blues",
title = "Passenger trips") +
tm_layout(main.title = "Passenger trips generated Weekday 6am-9am",
main.title.position = "center",
main.title.size = 0.7,
legend.height = 0.45,
legend.width = 0.35,
frame = TRUE) +
tm_borders(alpha = 0.5) +
tm_compass(type="8star", size = 1) +
tm_scale_bar() +
tm_grid(alpha =0.2) +
tm_credits("Source: Planning Sub-zone boundary from URA\n and Passenger trips data from LTA",
position = c("left", "bottom"))
mapA

The total ridership ranges from 1 to 357043 per hexagon. The range of the data is divided into quantile bands for a clearer distinction between the ridership of each hexagon.
mapB <- tm_shape(grid_weekdayPM)+
tm_fill("TOT_TRIPS",
style = "quantile",
palette = "Reds",
title = "Passenger trips") +
tm_layout(main.title = "Passenger trips generated Weekday 5pm-8pm",
main.title.position = "center",
main.title.size = 0.7,
legend.height = 0.45,
legend.width = 0.35,
frame = TRUE) +
tm_borders(alpha = 0.5) +
tm_compass(type="8star", size = 1) +
tm_scale_bar() +
tm_grid(alpha =0.2) +
tm_credits("Source: Planning Sub-zone boundary from URA\n and Passenger trips data from LTA",
position = c("left", "bottom"))
mapB

The total ridership ranges from 1 to 568845 per hexagon.
mapC <- tm_shape(grid_weekendAM)+
tm_fill("TOT_TRIPS",
style = "quantile",
palette = "Greens",
title = "Passenger trips") +
tm_layout(main.title = "Passenger trips generated Weekend/holidays 11am-2pm",
main.title.position = "center",
main.title.size = 0.7,
legend.height = 0.45,
legend.width = 0.35,
frame = TRUE) +
tm_borders(alpha = 0.5) +
tm_compass(type="8star", size = 1) +
tm_scale_bar() +
tm_grid(alpha =0.2) +
tm_credits("Source: Planning Sub-zone boundary from URA\n and Passenger trips data from LTA",
position = c("left", "bottom"))
mapC

The total ridership ranges from 1 to 117609 per hexagon.
mapD <- tm_shape(grid_weekendPM)+
tm_fill("TOT_TRIPS",
style = "quantile",
palette = "Purples",
title = "Passenger trips") +
tm_layout(main.title = "Passenger trips generated Weekend/holidays 4pm-7pm",
main.title.position = "center",
main.title.size = 0.7,
legend.height = 0.45,
legend.width = 0.35,
frame = TRUE) +
tm_borders(alpha = 0.5) +
tm_compass(type="8star", size = 1) +
tm_scale_bar() +
tm_grid(alpha =0.2) +
tm_credits("Source: Planning Sub-zone boundary from URA\n and Passenger trips data from LTA",
position = c("left", "bottom"))
mapD

The total ridership ranges from 1 to 114410 per hexagon.
Plot maps
We can also switch to a static plot mode by using tmap_mode("plot").
tmap_mode("plot")
mapA
This allows accessories such as the compass and scale bar to be displayed, which might be useful depending on the application or use of the map.
Choropleth map discussion
We can see that total ridership over the weekends is lower than on weekdays, despite both being classified as peak hours. This difference is likely due to people travelling to or from work on weekdays; on weekends there is less traffic, as people stay at home or travel to districts that are more accessible by other transport modes such as trains.
LISA
Creating neighbour count column
Before we calculate LISA, we will add the number of neighbours of each grid to each dataset. We will use st_contiguity() and add a neighbour column (nb) to the data under a new variable name. By default, the Queen method of contiguity is used for calculating the neighbours. Note that on a hexagonal grid any two touching cells share an edge, so Queen and Rook contiguity give identical results, with at most six neighbours per cell.
wm_qA <- grid_weekdayAM %>%
mutate(nb = st_contiguity(area_honeycomb_grid),
.before = 1)
wm_qB <- grid_weekdayPM %>%
mutate(nb = st_contiguity(area_honeycomb_grid),
.before = 1)
wm_qC <- grid_weekendAM %>%
mutate(nb = st_contiguity(area_honeycomb_grid),
.before = 1)
wm_qD <- grid_weekendPM %>%
mutate(nb = st_contiguity(area_honeycomb_grid),
.before = 1)

Here we can use the summary() function to take a look at the different regions and the neighbours that they have.
summary(wm_qA$nb)

Neighbour list object:
Number of regions: 1777
Number of nonzero links: 7672
Percentage nonzero weights: 0.2429594
Average number of links: 4.317389
12 regions with no links:
1 311 488 649 763 835 1224 1751 1754 1758 1767 1777
Link number distribution:
0 1 2 3 4 5 6
12 54 165 250 392 454 450
54 least connected regions:
2 3 8 22 23 35 36 56 65 70 110 126 172 199 200 209 210 218 235 251 347 372 650 662 696 733 754 804 816 817 842 844 859 896 925 981 1166 1178 1179 1181 1209 1259 1261 1322 1432 1495 1504 1521 1540 1552 1627 1661 1759 1760 with 1 link
450 most connected regions:
29 48 71 78 81 82 87 89 94 95 112 119 124 125 131 135 141 144 148 178 185 186 187 194 195 196 204 205 206 216 217 224 230 241 247 249 256 258 259 265 269 271 272 276 277 278 283 284 285 316 320 331 337 339 340 341 344 349 350 353 354 355 356 361 364 365 368 380 381 382 384 385 396 397 400 407 409 410 426 435 439 456 464 470 473 480 485 486 495 501 502 503 509 513 514 515 521 526 528 529 537 544 556 557 566 570 577 578 581 582 587 588 599 610 611 613 619 622 623 624 629 630 633 634 639 644 645 652 653 657 664 665 668 669 671 672 681 682 685 688 689 698 699 703 708 709 710 715 722 723 728 729 735 740 741 746 750 759 760 764 769 778 779 784 790 795 796 799 807 808 811 819 820 824 825 833 838 847 853 864 865 868 870 880 881 887 898 899 901 905 908 910 914 915 930 934 935 936 950 951 955 956 957 963 968 974 975 976 977 978 979 982 983 989 995 998 999 1000 1003 1004 1010 1013 1014 1017 1018 1020 1021 1022 1023 1029 1030 1034 1035 1036 1037 1039 1040 1041 1046 1055 1056 1059 1060 1061 1064 1065 1069 1074 1078 1079 1086 1087 1088 1090 1094 1095 1096 1098 1101 1102 1103 1109 1112 1113 1116 1119 1129 1130 1132 1134 1143 1144 1145 1146 1157 1158 1159 1170 1171 1177 1183 1184 1185 1202 1204 1205 1207 1219 1220 1221 1227 1228 1229 1230 1231 1237 1238 1242 1243 1244 1249 1253 1255 1256 1269 1272 1280 1281 1282 1285 1287 1288 1294 1295 1296 1301 1303 1309 1314 1315 1316 1320 1327 1328 1333 1335 1336 1337 1341 1347 1348 1354 1359 1361 1364 1366 1367 1369 1373 1380 1381 1382 1383 1384 1385 1389 1392 1393 1394 1395 1396 1401 1403 1406 1407 1408 1409 1416 1417 1420 1421 1422 1423 1424 1425 1427 1429 1435 1436 1437 1445 1449 1450 1451 1459 1463 1468 1482 1485 1493 1496 1500 1506 1511 1514 1534 1535 1542 1543 1547 1549 1556 1557 1562 1563 1571 1572 1580 1581 1583 1584 1585 1589 1592 1593 1596 1597 1599 1600 1606 1609 1610 1615 1618 1619 1620 1621 1625 1633 1642 1649 1654 1656 1657 1665 1666 1668 1669 1672 1679 1681 1682 1690 1692 1694 1696 1698 1702 1703 1704 1706 1708 1709 1713 1716 
1722 1723 1726 1727 1730 with 6 links
We see that there are 1777 regions with an average of 4.3 neighbours. There are 12 hexagons with 0 neighbours and 450 hexagons with 6 neighbours.
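As a quick sanity check (a sketch, assuming spdep is available), the link counts reported by summary() can be recomputed directly with spdep::card(), which returns the number of neighbours for every region:

```r
# card() gives the neighbour count per region; tabulating it reproduces
# the link number distribution shown above
nb_counts <- spdep::card(wm_qA$nb)
table(nb_counts)
sum(nb_counts == 0)  # number of hexagons with no neighbours
```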
summary(wm_qB$nb)

Neighbour list object:
Number of regions: 1778
Number of nonzero links: 7680
Percentage nonzero weights: 0.2429393
Average number of links: 4.31946
12 regions with no links:
1 311 488 649 763 835 1224 1752 1755 1759 1768 1778
Link number distribution:
0 1 2 3 4 5 6
12 53 164 251 394 454 450
53 least connected regions:
2 3 8 22 23 35 36 56 65 70 110 126 172 199 200 201 218 219 235 251 347 372 650 662 696 733 754 804 816 817 842 844 859 896 925 981 1166 1178 1179 1181 1209 1259 1261 1322 1432 1495 1504 1541 1553 1628 1662 1760 1761 with 1 link
450 most connected regions:
29 48 71 78 81 82 87 89 94 95 112 119 124 125 131 135 141 144 148 178 185 186 187 194 195 196 205 206 207 216 217 224 230 241 247 249 256 258 259 265 269 271 272 276 277 278 283 284 285 316 320 331 337 339 340 341 344 349 350 353 354 355 356 361 364 365 368 380 381 382 384 385 396 397 400 407 409 410 426 435 439 456 464 470 473 480 485 486 495 501 502 503 509 513 514 515 521 526 528 529 537 544 556 557 566 570 577 578 581 582 587 588 599 610 611 613 619 622 623 624 629 630 633 634 639 644 645 652 653 657 664 665 668 669 671 672 681 682 685 688 689 698 699 703 708 709 710 715 722 723 728 729 735 740 741 746 750 759 760 764 769 778 779 784 790 795 796 799 807 808 811 819 820 824 825 833 838 847 853 864 865 868 870 880 881 887 898 899 901 905 908 910 914 915 930 934 935 936 950 951 955 956 957 963 968 974 975 976 977 978 979 982 983 989 995 998 999 1000 1003 1004 1010 1013 1014 1017 1018 1020 1021 1022 1023 1029 1030 1034 1035 1036 1037 1039 1040 1041 1046 1055 1056 1059 1060 1061 1064 1065 1069 1074 1078 1079 1086 1087 1088 1090 1094 1095 1096 1098 1101 1102 1103 1109 1112 1113 1116 1119 1129 1130 1132 1134 1143 1144 1145 1146 1157 1158 1159 1170 1171 1177 1183 1184 1185 1202 1204 1205 1207 1219 1220 1221 1227 1228 1229 1230 1231 1237 1238 1242 1243 1244 1249 1253 1255 1256 1269 1272 1280 1281 1282 1285 1287 1288 1294 1295 1296 1301 1303 1309 1314 1315 1316 1320 1327 1328 1333 1335 1336 1337 1341 1347 1348 1354 1359 1361 1364 1366 1367 1369 1373 1380 1381 1382 1383 1384 1385 1389 1392 1393 1394 1395 1396 1401 1403 1406 1407 1408 1409 1416 1417 1420 1421 1422 1423 1424 1425 1427 1429 1435 1436 1437 1445 1449 1450 1451 1459 1463 1468 1482 1485 1493 1496 1500 1506 1511 1514 1535 1536 1543 1544 1548 1550 1557 1558 1563 1564 1572 1573 1581 1582 1584 1585 1586 1590 1593 1594 1597 1598 1600 1601 1607 1610 1611 1616 1619 1620 1621 1622 1626 1634 1643 1650 1655 1657 1658 1666 1667 1669 1670 1673 1680 1682 1683 1691 1693 1695 1697 1699 1703 1704 1705 1707 1709 1710 1714 1717 
1723 1724 1727 1728 1731 with 6 links
We see that there are 1778 regions with an average of 4.3 neighbours. There are 12 hexagons with 0 neighbours and 450 hexagons with 6 neighbours.
summary(wm_qC$nb)

Neighbour list object:
Number of regions: 1784
Number of nonzero links: 7702
Percentage nonzero weights: 0.2419991
Average number of links: 4.317265
11 regions with no links:
288 486 647 760 832 1222 1758 1761 1765 1774 1784
Link number distribution:
0 1 2 3 4 5 6
11 56 164 253 395 451 454
56 least connected regions:
1 2 7 21 22 34 35 55 64 69 109 125 150 170 196 197 205 206 214 230 246 310 346 371 648 659 693 730 751 801 813 814 839 841 856 893 922 978 1164 1176 1177 1207 1257 1259 1320 1432 1442 1493 1496 1505 1544 1557 1634 1668 1766 1767 with 1 link
454 most connected regions:
28 47 70 77 80 81 86 88 93 94 111 118 123 124 130 134 140 143 146 176 183 184 185 191 192 193 201 202 203 212 213 219 225 236 242 244 251 253 254 260 264 266 267 271 272 273 278 279 280 315 319 330 336 338 339 340 343 348 349 352 353 354 355 360 363 364 367 379 380 381 383 384 395 396 399 406 408 409 425 434 438 455 463 468 471 478 483 484 493 499 500 501 507 511 512 513 519 524 526 527 535 542 554 555 564 568 575 576 579 580 585 586 597 608 609 611 617 620 621 622 627 628 631 632 637 642 643 649 650 654 661 662 665 666 668 669 678 679 682 685 686 695 696 700 705 706 707 712 719 720 725 726 732 737 738 743 747 756 757 761 766 775 776 781 787 792 793 796 804 805 808 816 817 821 822 830 835 844 850 861 862 865 867 877 878 884 895 896 898 902 905 907 911 912 927 931 932 933 947 948 952 953 954 960 965 971 972 973 974 975 976 979 980 986 992 995 996 997 1000 1001 1007 1010 1011 1014 1015 1017 1018 1019 1020 1026 1027 1031 1032 1033 1034 1036 1037 1038 1043 1052 1053 1056 1057 1058 1061 1062 1066 1071 1075 1076 1083 1084 1085 1087 1091 1092 1093 1095 1098 1099 1100 1106 1109 1110 1113 1116 1126 1127 1129 1131 1140 1141 1142 1143 1155 1156 1157 1168 1169 1175 1181 1182 1183 1200 1202 1203 1205 1217 1218 1219 1225 1226 1227 1228 1229 1235 1236 1240 1241 1242 1247 1251 1253 1254 1267 1270 1278 1279 1280 1283 1285 1286 1292 1293 1294 1299 1301 1307 1312 1313 1314 1318 1325 1326 1331 1333 1334 1335 1339 1345 1346 1352 1357 1359 1362 1364 1365 1367 1371 1378 1379 1380 1381 1382 1383 1388 1391 1392 1393 1394 1395 1400 1402 1405 1406 1407 1408 1416 1417 1420 1421 1422 1423 1424 1425 1427 1429 1435 1436 1437 1446 1450 1451 1452 1460 1464 1469 1483 1486 1494 1497 1501 1507 1512 1515 1517 1531 1538 1539 1545 1546 1547 1552 1554 1561 1562 1566 1567 1568 1577 1578 1586 1587 1589 1590 1591 1596 1599 1600 1603 1604 1606 1607 1613 1616 1617 1622 1625 1626 1627 1628 1632 1640 1649 1656 1661 1663 1664 1672 1673 1675 1676 1679 1686 1688 1689 1697 1699 1701 1703 1705 1709 1710 1711 1713 
1715 1716 1720 1723 1729 1730 1733 1734 1737 with 6 links
We see that there are 1784 regions with an average of 4.3 neighbours. There are 11 hexagons with 0 neighbours and 454 hexagons with 6 neighbours.
summary(wm_qD$nb)

Neighbour list object:
Number of regions: 1773
Number of nonzero links: 7636
Percentage nonzero weights: 0.2429117
Average number of links: 4.306825
10 regions with no links:
1 293 491 652 762 834 1221 1753 1756 1760
Link number distribution:
0 1 2 3 4 5 6
10 55 166 259 389 448 446
55 least connected regions:
2 3 8 22 23 35 36 56 65 70 110 126 172 198 199 208 209 218 235 251 315 351 376 653 663 697 733 753 803 815 816 841 843 858 895 924 950 977 1163 1175 1176 1206 1256 1258 1319 1431 1441 1496 1505 1543 1556 1633 1667 1761 1762 with 1 link
446 most connected regions:
29 48 71 78 81 82 87 89 94 95 112 119 124 125 131 135 141 144 148 178 185 186 187 193 194 195 204 205 206 216 217 224 230 241 247 249 256 258 259 265 269 271 272 276 277 278 283 284 285 320 324 335 341 343 344 345 348 353 354 357 358 359 360 365 368 369 372 384 385 386 388 389 400 401 404 411 413 414 430 439 443 460 468 473 476 483 488 489 498 504 505 506 512 516 517 518 524 529 531 532 540 547 559 560 569 573 580 581 584 585 590 591 602 613 614 616 622 625 626 627 632 633 636 642 654 655 665 666 670 672 673 682 683 686 689 690 699 700 704 709 710 711 715 722 723 728 729 735 739 740 745 749 758 759 763 768 777 778 783 789 794 795 798 806 807 810 818 819 823 824 832 837 846 852 863 864 867 869 879 880 886 897 898 900 904 907 909 913 914 929 932 933 934 948 949 952 953 954 960 965 972 973 974 975 978 979 985 994 995 996 999 1000 1006 1009 1010 1013 1014 1016 1017 1018 1019 1025 1026 1030 1031 1032 1033 1035 1036 1037 1042 1051 1052 1055 1056 1057 1060 1061 1065 1070 1074 1075 1082 1083 1084 1086 1090 1091 1092 1094 1097 1098 1099 1105 1108 1109 1112 1115 1125 1126 1128 1130 1139 1140 1141 1142 1154 1155 1156 1167 1168 1174 1180 1181 1182 1199 1201 1202 1204 1216 1217 1218 1224 1225 1226 1227 1228 1234 1235 1239 1240 1241 1246 1250 1252 1253 1266 1269 1277 1278 1279 1282 1284 1285 1291 1292 1293 1298 1300 1306 1311 1312 1313 1317 1324 1325 1330 1332 1333 1334 1338 1344 1345 1351 1356 1358 1361 1363 1364 1366 1370 1377 1378 1379 1380 1381 1382 1387 1390 1391 1392 1393 1394 1399 1401 1404 1405 1406 1407 1415 1416 1419 1420 1421 1422 1423 1424 1426 1428 1434 1435 1436 1445 1449 1450 1451 1459 1463 1468 1483 1486 1494 1497 1501 1507 1512 1515 1517 1530 1537 1538 1544 1545 1546 1551 1553 1560 1561 1565 1566 1567 1576 1577 1585 1586 1588 1589 1590 1595 1598 1599 1602 1603 1605 1606 1612 1615 1616 1621 1624 1625 1626 1627 1631 1639 1648 1655 1660 1662 1663 1671 1672 1674 1675 1678 1685 1687 1688 1696 1698 1700 1702 1704 1708 1709 1710 1712 1714 1715 1719 1722 1728 1729 1732 
1733 1736 with 6 links
We see that there are 1773 regions with an average of 4.3 neighbours. There are 10 hexagons with 0 neighbours and 446 hexagons with 6 neighbours.
Neighbour count
Here we can convert the above neighbour counts into dataframes for a more direct comparison.
df_countsA <- wm_qA %>%
summarise(vector_length = map_dbl(nb, ~length(.))) %>%
group_by(vector_length) %>%
summarise(count = n()) %>%
st_drop_geometry()
df_countsB <- wm_qB %>%
summarise(vector_length = map_dbl(nb, ~length(.))) %>%
group_by(vector_length) %>%
summarise(count = n())%>%
st_drop_geometry()
df_countsC <- wm_qC %>%
summarise(vector_length = map_dbl(nb, ~length(.))) %>%
group_by(vector_length) %>%
summarise(count = n())%>%
st_drop_geometry()
df_countsD <- wm_qD %>%
summarise(vector_length = map_dbl(nb, ~length(.))) %>%
group_by(vector_length) %>%
summarise(count = n())%>%
st_drop_geometry()
Compiling neighbour count
We will use the bind_rows() function to group all of the neighbour count data together and rearrange them.
result <- bind_rows(
mutate(df_countsA, dataset = "Weekday 6am to 9am"),
mutate(df_countsB, dataset = "Weekday 5pm to 8pm"),
mutate(df_countsC, dataset = "Weekends/holidays 11am to 2pm"),
mutate(df_countsD, dataset = "Weekends/holidays 4pm to 7pm")
) %>%
rename("number_of_neighbours" = vector_length, count = count)
result <- result %>%
group_by(dataset) %>%
arrange(desc(number_of_neighbours)) %>%
mutate(number_of_neighbours = factor(number_of_neighbours, levels = unique(number_of_neighbours)))

Plotting neighbour count bar chart
Next we will use ggplot to plot the bar chart for the number of hexagons with differing neighbour count and the different time interval.
ggplot(result, aes(x = number_of_neighbours, y = count, fill = dataset)) +
stat_summary(fun = "sum", geom = "bar", position = "dodge") +
scale_fill_manual(values = c(
"Weekday 6am to 9am" = "blue4",
"Weekday 5pm to 8pm" = "brown2",
"Weekends/holidays 11am to 2pm" = "palegreen",
"Weekends/holidays 4pm to 7pm" = "mediumpurple1")) +
labs(title = "Number of Neighbours Counts",
x = "Number of Neighbours",
y = "Count") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
Overall we see a decrease in ridership from weekdays to weekends; however, the number of bus stops utilised remains fairly constant. This may mean an overall decrease in ridership across all bus stops, while the set of bus stops used for boarding stays the same.
Next we will have to remove all of the data points with 0 neighbours using filter() and the purrr package.
wm_qA_process <- wm_qA %>%
filter(!purrr::map_lgl(nb, ~ all(. == 0)))
wm_qB_process <- wm_qB %>%
filter(!purrr::map_lgl(nb, ~ all(. == 0)))
wm_qC_process <- wm_qC %>%
filter(!purrr::map_lgl(nb, ~ all(. == 0)))
wm_qD_process <- wm_qD %>%
filter(!purrr::map_lgl(nb, ~ all(. == 0)))

Spatial weights

Next we will derive row-standardised spatial weights (style = "W") for each neighbour list using st_weights().
wm_qA_weighted <- wm_qA_process %>%
mutate(nb = st_contiguity(area_honeycomb_grid),
wt = st_weights(nb, style = "W"),
.before = 1)
wm_qB_weighted <- wm_qB_process %>%
mutate(nb = st_contiguity(area_honeycomb_grid),
wt = st_weights(nb, style = "W"),
.before = 1)
wm_qC_weighted <- wm_qC_process %>%
mutate(nb = st_contiguity(area_honeycomb_grid),
wt = st_weights(nb, style = "W"),
.before = 1)
wm_qD_weighted <- wm_qD_process %>%
mutate(nb = st_contiguity(area_honeycomb_grid),
wt = st_weights(nb, style = "W"),
.before = 1)

Local Moran’s I
Here we will be using the local_moran() function to calculate the local Moran’s I for each region or county.
lisaA <- wm_qA_weighted %>%
mutate(local_moran = local_moran(
TOT_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
lisaB <- wm_qB_weighted %>%
mutate(local_moran = local_moran(
TOT_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
lisaC <- wm_qC_weighted %>%
mutate(local_moran = local_moran(
TOT_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)
lisaD <- wm_qD_weighted %>%
mutate(local_moran = local_moran(
TOT_TRIPS, nb, wt, nsim = 99),
.before = 1) %>%
unnest(local_moran)

The output will be a data frame containing ii, eii, var_ii, z_ii, p_ii, p_ii_sim and p_folded_sim.
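Before mapping, a quick inspection of the new columns (a sketch) helps confirm that the statistic and its pseudo p-values were computed:

```r
# Peek at the local Moran's I statistic (ii) and its p-values
lisaA %>%
  st_drop_geometry() %>%
  select(ii, p_ii, p_ii_sim) %>%
  head()
```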
Combined visualisation of local Moran’s I and p-value
Here we will place the maps next to each other.
tmap_mode("plot")
map1A <- tm_shape(lisaA) +
tm_fill("ii") +
tm_borders(alpha = 0.5) +
tm_view(set.zoom.limits = c(6,8)) +
tm_layout(main.title = "local Moran's I of the total number of trips",
main.title.size = 0.8)
map2A <- tm_shape(lisaA) +
tm_fill("p_ii",
breaks = c(0, 0.001, 0.01, 0.05, 1),
labels = c("0.001", "0.01", "0.05", "Not sig")) +
tm_borders(alpha = 0.5) +
tm_layout(main.title = "p-value of local Moran's I",
main.title.size = 0.8)
tmap_arrange(map1A, map2A, ncol = 2)
tmap_mode("plot")
map1B <- tm_shape(lisaB) +
tm_fill("ii") +
tm_borders(alpha = 0.5) +
tm_view(set.zoom.limits = c(6,8)) +
tm_layout(main.title = "local Moran's I of the total number of trips",
main.title.size = 0.8)
map2B <- tm_shape(lisaB) +
tm_fill("p_ii",
breaks = c(0, 0.001, 0.01, 0.05, 1),
labels = c("0.001", "0.01", "0.05", "Not sig")) +
tm_borders(alpha = 0.5) +
tm_layout(main.title = "p-value of local Moran's I",
main.title.size = 0.8)
tmap_arrange(map1B, map2B, ncol = 2)
tmap_mode("plot")
map1C <- tm_shape(lisaC) +
tm_fill("ii") +
tm_borders(alpha = 0.5) +
tm_view(set.zoom.limits = c(6,8)) +
tm_layout(main.title = "local Moran's I of the total number of trips",
main.title.size = 0.8)
map2C <- tm_shape(lisaC) +
tm_fill("p_ii",
breaks = c(0, 0.001, 0.01, 0.05, 1),
labels = c("0.001", "0.01", "0.05", "Not sig")) +
tm_borders(alpha = 0.5) +
tm_layout(main.title = "p-value of local Moran's I",
main.title.size = 0.8)
tmap_arrange(map1C, map2C, ncol = 2)
tmap_mode("plot")
map1D <- tm_shape(lisaD) +
tm_fill("ii") +
tm_borders(alpha = 0.5) +
tm_view(set.zoom.limits = c(6,8)) +
tm_layout(main.title = "local Moran's I of the total number of trips",
main.title.size = 0.8)
map2D <- tm_shape(lisaD) +
tm_fill("p_ii",
breaks = c(0, 0.001, 0.01, 0.05, 1),
labels = c("0.001", "0.01", "0.05", "Not sig")) +
tm_borders(alpha = 0.5) +
tm_layout(main.title = "p-value of local Moran's I",
main.title.size = 0.8)
tmap_arrange(map1D, map2D, ncol = 2)
Visualisation of LISA
Here we will visualise LISA, where we can see the presence of outliers and clusters. More information can be found here. The following is a newer method for calculating LISA that requires fewer and more concise steps, such as not having to manually form the high-high, high-low, low-high and low-low quadrants. Note that if the data is skewed, we should use the median rather than the mean for forming the quadrants.
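One simple way to decide between the mean- and median-based quadrant labels (a sketch; the threshold below is an assumption, not a standard rule) is to compare the mean and median of the variable:

```r
# If TOT_TRIPS is strongly skewed (mean far from the median relative to the
# spread), prefer the median-based quadrant column returned by local_moran();
# otherwise the mean-based column is fine.
skewed <- abs(mean(lisaA$TOT_TRIPS) - median(lisaA$TOT_TRIPS)) >
  0.5 * sd(lisaA$TOT_TRIPS)
quad_col <- if (skewed) "median" else "mean"
```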
lisa_sigA <- lisaA %>%
filter(p_ii < 0.05)
tmap_mode("view")
tm_shape(lisaA) +
tm_polygons() +
tm_borders(alpha = 0.5) +
tm_shape(lisa_sigA) +
tm_fill("mean") +
tm_borders(alpha = 0.4)

lisa_sigB <- lisaB %>%
filter(p_ii < 0.05)
tmap_mode("view")
tm_shape(lisaB) +
tm_polygons() +
tm_borders(alpha = 0.5) +
tm_shape(lisa_sigB) +
tm_fill("mean") +
tm_borders(alpha = 0.4)

lisa_sigC <- lisaC %>%
filter(p_ii < 0.05)
tmap_mode("view")
tm_shape(lisaC) +
tm_polygons() +
tm_borders(alpha = 0.5) +
tm_shape(lisa_sigC) +
tm_fill("mean") +
tm_borders(alpha = 0.4)

lisa_sigD <- lisaD %>%
filter(p_ii < 0.05)
tmap_mode("view")
tm_shape(lisaD) +
tm_polygons() +
tm_borders(alpha = 0.5) +
tm_shape(lisa_sigD) +
tm_fill("mean") +
tm_borders(alpha = 0.4)

Additional processing
Passenger flow visualisation
Origin and destination grouping
We can also take a look at passenger flow between the different hexagons by plotting desire lines. We will first need to obtain the total number of trips in each interval, grouped by origin and then by destination.
des_weekdayAM_6_9 <- odbus %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 6 & TIME_PER_HOUR <= 9) %>%
group_by(ORIGIN_PT_CODE,
DESTINATION_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
des_weekdayPM_5_8 <- odbus %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 17 & TIME_PER_HOUR <= 20) %>%
group_by(ORIGIN_PT_CODE,
DESTINATION_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
des_weekendAM_11_14 <- odbus %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 11 & TIME_PER_HOUR <= 14) %>%
group_by(ORIGIN_PT_CODE,
DESTINATION_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
des_weekendPM_16_19 <- odbus %>%
filter(DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter(TIME_PER_HOUR >= 16 & TIME_PER_HOUR <= 19) %>%
group_by(ORIGIN_PT_CODE,
DESTINATION_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
Joining destination table and grid
Using the grid that we created at the beginning, we will now join the data containing the destinations to the grid.
combn_des_weekdayAM_6_9 <- left_join(des_weekdayAM_6_9,busstop_hex) %>%
rename(ORIGIN_BS = ORIGIN_PT_CODE,
ORIGIN_GRID = grid_id,
DESTIN_BS = DESTINATION_PT_CODE)
combn_des_weekdayPM_5_8 <- left_join(des_weekdayPM_5_8,busstop_hex) %>%
rename(ORIGIN_BS = ORIGIN_PT_CODE,
ORIGIN_GRID = grid_id,
DESTIN_BS = DESTINATION_PT_CODE)
combn_des_weekendAM_11_14 <- left_join(des_weekendAM_11_14,busstop_hex) %>%
rename(ORIGIN_BS = ORIGIN_PT_CODE,
ORIGIN_GRID = grid_id,
DESTIN_BS = DESTINATION_PT_CODE)
combn_des_weekendPM_16_19 <- left_join(des_weekendPM_16_19,busstop_hex) %>%
rename(ORIGIN_BS = ORIGIN_PT_CODE,
ORIGIN_GRID = grid_id,
DESTIN_BS = DESTINATION_PT_CODE)
Removing duplicates
We will then remove any duplicates using the unique() function.
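On data frames, unique() drops rows that are exact duplicates across every column; a small illustrative example (toy values):

```r
# Toy table where the first two rows are identical in every column
dup_df <- data.frame(
  ORIGIN_BS   = c("01012", "01012", "01013"),
  ORIGIN_GRID = c(1, 1, 2),
  TRIPS       = c(15, 15, 7)
)

uni_df <- unique(dup_df)
nrow(uni_df)  # the duplicated row is removed, leaving two rows
```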
uni_des_weekdayAM_6_9 <- unique(combn_des_weekdayAM_6_9)
uni_des_weekdayPM_5_8 <- unique(combn_des_weekdayPM_5_8)
uni_des_weekendAM_11_14 <- unique(combn_des_weekendAM_11_14)
uni_des_weekendPM_16_19 <- unique(combn_des_weekendPM_16_19)
Rejoin hexagonal information
We can then rejoin the hexagonal information, this time on the destination bus stop, using the left_join() function.
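The by = c("DESTIN_BS" = "ORIGIN_PT_CODE") argument joins columns that have different names in the two tables. A base-R sketch of the same idea with merge(), using hypothetical toy tables:

```r
# Hypothetical trips table and hexagon lookup table
trips_tbl <- data.frame(
  ORIGIN_BS = "01012",
  DESTIN_BS = c("01113", "01121"),
  TRIPS     = c(15, 3)
)
hex_tbl <- data.frame(
  ORIGIN_PT_CODE = c("01113", "01121"),
  grid_id        = c(101, 102)
)

# left_join(x, y, by = c("DESTIN_BS" = "ORIGIN_PT_CODE")) in base R:
joined <- merge(trips_tbl, hex_tbl,
                by.x = "DESTIN_BS", by.y = "ORIGIN_PT_CODE",
                all.x = TRUE)
joined$grid_id
```

Each destination bus stop picks up the grid_id of the hexagon it falls in, so every flow now carries both an origin and a destination hexagon.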
poly_des_weekdayAM_6_9 <- left_join(uni_des_weekdayAM_6_9 , busstop_hex,
by = c("DESTIN_BS" = "ORIGIN_PT_CODE"))
poly_des_weekdayPM_5_8 <- left_join(uni_des_weekdayPM_5_8 , busstop_hex,
by = c("DESTIN_BS" = "ORIGIN_PT_CODE"))
poly_des_weekendAM_11_14 <- left_join(uni_des_weekendAM_11_14 , busstop_hex,
by = c("DESTIN_BS" = "ORIGIN_PT_CODE"))
poly_des_weekendPM_16_19 <- left_join(uni_des_weekendPM_16_19 , busstop_hex,
by = c("DESTIN_BS" = "ORIGIN_PT_CODE"))
Recheck unique data
We will then recheck for unique data points by running unique() again.
uniP_des_weekdayAM_6_9 <- unique(poly_des_weekdayAM_6_9)
uniP_des_weekdayPM_5_8 <- unique(poly_des_weekdayPM_5_8)
uniP_des_weekendAM_11_14 <- unique(poly_des_weekendAM_11_14)
uniP_des_weekendPM_16_19 <- unique(poly_des_weekendPM_16_19)
Sum total trips
We will then sum up the total number of trips made from each origin hexagon to each destination hexagon using group_by(). Supplying multiple columns to group_by() sub-categorises the data, so grouping on both ORIGIN_GRID and DESTIN_GRID lets us track where passengers alight.
ori_des_weekdayAM_6_9 <- uniP_des_weekdayAM_6_9 %>%
rename(DESTIN_GRID = grid_id) %>%
drop_na() %>%
group_by(ORIGIN_GRID, DESTIN_GRID) %>%
summarise(PEAK = sum(TRIPS))
ori_des_weekdayPM_5_8 <- uniP_des_weekdayPM_5_8 %>%
rename(DESTIN_GRID = grid_id) %>%
drop_na() %>%
group_by(ORIGIN_GRID, DESTIN_GRID) %>%
summarise(PEAK = sum(TRIPS))
ori_des_weekendAM_11_14 <- uniP_des_weekendAM_11_14 %>%
rename(DESTIN_GRID = grid_id) %>%
drop_na() %>%
group_by(ORIGIN_GRID, DESTIN_GRID) %>%
summarise(PEAK = sum(TRIPS))
ori_des_weekendPM_16_19 <- uniP_des_weekendPM_16_19 %>%
rename(DESTIN_GRID = grid_id) %>%
drop_na() %>%
group_by(ORIGIN_GRID, DESTIN_GRID) %>%
summarise(PEAK = sum(TRIPS))
Visualisation
We will first remove any intra-hexagonal travel, i.e. trips that start and end in the same hexagon.
R_weekdayAM_6_9 <- ori_des_weekdayAM_6_9[ori_des_weekdayAM_6_9$ORIGIN_GRID!=ori_des_weekdayAM_6_9$DESTIN_GRID,]
R_weekdayPM_5_8 <- ori_des_weekdayPM_5_8[ori_des_weekdayPM_5_8$ORIGIN_GRID!=ori_des_weekdayPM_5_8$DESTIN_GRID,]
R_weekendAM_11_14 <- ori_des_weekendAM_11_14[ori_des_weekendAM_11_14$ORIGIN_GRID!=ori_des_weekendAM_11_14$DESTIN_GRID,]
R_weekendPM_16_19 <- ori_des_weekendPM_16_19[ori_des_weekendPM_16_19$ORIGIN_GRID!=ori_des_weekendPM_16_19$DESTIN_GRID,]
Create desire lines
Next we will build the desire lines, the straight-line connections between origin and destination hexagons, using the od2line() function from the stplanr package.
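Conceptually, od2line() looks up the origin and destination zone of each flow and draws a straight line between the two zone centroids. A base-R sketch of that lookup step, with hypothetical centroid coordinates in place of the real hexagon geometry:

```r
# Hypothetical hexagon centroids (grid_id with x/y coordinates in metres)
centroids <- data.frame(
  grid_id = c(101, 102, 103),
  x       = c(0, 1000, 2000),
  y       = c(0,  500, 1500)
)
flows <- data.frame(
  ORIGIN_GRID = c(101, 102),
  DESTIN_GRID = c(102, 103),
  PEAK        = c(1800, 2500)
)

# For each flow, look up its origin and destination centroids -> line endpoints
o <- centroids[match(flows$ORIGIN_GRID, centroids$grid_id), c("x", "y")]
d <- centroids[match(flows$DESTIN_GRID, centroids$grid_id), c("x", "y")]
segments <- cbind(flows, x0 = o$x, y0 = o$y, x1 = d$x, y1 = d$y)
segments
```

od2line() does this on sf geometry and returns proper LINESTRING features, but the matching of flow rows to zone identifiers is the same idea.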
flow_weekdayAM_6_9 <- od2line(flow = R_weekdayAM_6_9,
zones = honeycomb_grid_sf,
zone_code = "grid_id")
flow_weekdayPM_5_8 <- od2line(flow = R_weekdayPM_5_8,
zones = honeycomb_grid_sf,
zone_code = "grid_id")
flow_weekendAM_11_14 <- od2line(flow = R_weekendAM_11_14,
zones = honeycomb_grid_sf,
zone_code = "grid_id")
flow_weekendPM_16_19 <- od2line(flow = R_weekendPM_16_19,
zones = honeycomb_grid_sf,
zone_code = "grid_id")Visualise desire lines
We can then visualise the flows using the following code. Since the overall passenger flow is high, it is better to limit the visualisation; here we will only draw flows of 1500 or more passenger trips between hexagons.
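Before plotting, it is worth checking how many desire lines survive the threshold, and what the quantile breaks used by style = "quantile" with n = 6 look like. A small sketch with hypothetical flow volumes:

```r
# Hypothetical inter-hexagon flow volumes
peak <- c(100, 1600, 800, 2500, 1499, 1500)

keep <- peak >= 1500   # only these desire lines would be drawn
sum(keep)              # 1600, 2500 and 1500 pass the threshold

# style = "quantile" with n = 6 splits the data at these probabilities
breaks <- quantile(peak, probs = seq(0, 1, length.out = 7))
length(breaks)         # 7 break points delimit 6 classes
```

If too few or too many lines survive, the threshold can be adjusted before committing to the map.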
tm_shape(grid_weekdayAM) +
tm_polygons() +
flow_weekdayAM_6_9 %>%
filter(PEAK >= 1500) %>%
tm_shape() +
tm_lines(lwd = "PEAK",
style = "quantile",
scale = c(0.1, 1, 3, 5, 7, 10),
n = 6,
alpha = 0.3)
tm_shape(grid_weekdayPM) +
tm_polygons() +
flow_weekdayPM_5_8 %>%
filter(PEAK >= 1500) %>%
tm_shape() +
tm_lines(lwd = "PEAK",
style = "quantile",
scale = c(0.1, 1, 3, 5, 7, 10),
n = 6,
alpha = 0.3)
tm_shape(grid_weekendAM) +
tm_polygons() +
flow_weekendAM_11_14 %>%
filter(PEAK >= 1500) %>%
tm_shape() +
tm_lines(lwd = "PEAK",
style = "quantile",
scale = c(0.1, 1, 3, 5, 7, 10),
n = 6,
alpha = 0.3)
tm_shape(grid_weekendPM) +
tm_polygons() +
flow_weekendPM_16_19 %>%
filter(PEAK >= 1500) %>%
tm_shape() +
tm_lines(lwd = "PEAK",
style = "quantile",
scale = c(0.1, 1, 3, 5, 7, 10),
n = 6,
alpha = 0.3)